Partial aggregation for collective communication in distributed memory machines
High Performance Computing (HPC) systems interconnect a large number of Processing Elements (PEs) in high-bandwidth networks to simulate complex scientific problems. The increasing scale of HPC systems poses great challenges for algorithm designers. As the average distance between PEs increases, data movement across hierarchical memory subsystems introduces high latency. Minimizing latency is particularly challenging in collective communications, where many PEs may interact in complex communication patterns. Although collective communications can be optimized for network-level parallelism, occasional synchronization delays due to dependencies in the communication pattern degrade application performance.
To reduce the performance impact of communication and synchronization costs, parallel algorithms are designed with sophisticated latency hiding techniques. The principle is to interleave computation with asynchronous communication, which increases the overall occupancy of compute cores. However, collective communication primitives abstract parallelism, which limits the integration of latency hiding techniques. Approaches to work around these limitations either modify the algorithmic structure of application codes or replace collective primitives with verbose low-level communication calls. While these approaches give fine-grained control for latency hiding, implementing collective communication algorithms is challenging and requires expert knowledge of HPC network topologies.
A collective communication pattern is commonly described as a Directed Acyclic Graph (DAG) where a set of PEs, represented as vertices, resolve data dependencies through communication along the edges. Our approach improves latency hiding in collective communication through partial aggregation. Based on mathematical rules of binary operations and homomorphism, we expose data parallelism in the respective DAG to overlap computation with communication. The proposed concepts are implemented and evaluated with a subset of collective primitives in the Message Passing Interface (MPI), an established communication standard in scientific computing. An experimental analysis with communication-bound microbenchmarks shows considerable performance benefits for the evaluated collective primitives. A detailed case study with a large-scale distributed sort algorithm demonstrates how partial aggregation significantly improves performance in data-intensive scenarios. Besides better latency hiding capabilities with collective communication primitives, our approach enables further optimizations of their implementations within MPI libraries.
The many asynchronous programming models that are actively studied in the HPC community benefit from partial aggregation in collective communication patterns. Future work can utilize partial aggregation to improve the interaction of MPI collectives with accelerator architectures, and to design more efficient communication algorithms.
Averaging lifetimes for B hadron species
The measurement of the lifetimes of the individual B species is of great interest. Many of these measurements are well below the 10% level of precision. However, in order to reach the precision necessary to test the current theoretical predictions, the results from different experiments need to be averaged. Therefore, the relevant systematic uncertainties of each measurement need to be well defined in order to understand the correlations between the results from different experiments. In this paper we discuss the dominant sources of systematic errors which lead to correlations between the different measurements. We point out problems connected with the conventional approach of combining lifetime data and discuss methods which overcome these problems.
Search for Higgs Bosons in e+e- Collisions at 183 GeV
The data collected by the OPAL experiment at sqrt(s) = 183 GeV were used to
search for Higgs bosons which are predicted by the Standard Model and various
extensions, such as general models with two Higgs field doublets and the
Minimal Supersymmetric Standard Model (MSSM). The data correspond to an
integrated luminosity of approximately 54 pb-1. None of the searches for neutral
and charged Higgs bosons have revealed an excess of events beyond the expected
background. This negative outcome, in combination with similar results from
searches at lower energies, leads to new limits for the Higgs boson masses and
other model parameters. In particular, the 95% confidence level lower limit for
the mass of the Standard Model Higgs boson is 88.3 GeV. Charged Higgs bosons
can be excluded for masses up to 59.5 GeV. In the MSSM, mh > 70.5 GeV and mA >
72.0 GeV are obtained for tan(beta) > 1, no and maximal scalar top mixing and
soft SUSY-breaking masses of 1 TeV. The range 0.8 < tan(beta) < 1.9 is excluded for
minimal scalar top mixing and m(top) < 175 GeV. More general scans of the MSSM
parameter space are also considered. Comment: 49 pages, LaTeX, including 33 eps
figures, submitted to European Physical Journal
A Measurement of the Product Branching Ratio f(b->Lambda_b).BR(Lambda_b->Lambda X) in Z0 Decays
The product branching ratio, f(b->Lambda_b).BR(Lambda_b->Lambda X), where
Lambda_b denotes any weakly-decaying b-baryon, has been measured using the OPAL
detector at LEP. Lambda_b are selected by the presence of energetic Lambda
particles in bottom events tagged by the presence of displaced secondary
vertices. A fit to the momenta of the Lambda particles separates signal from B
meson and fragmentation backgrounds. The measured product branching ratio is
f(b->Lambda_b).BR(Lambda_b->Lambda X) = (2.67 +- 0.38(stat) +0.67/-0.60(sys))%.
Combined with a previous OPAL measurement, one obtains
f(b->Lambda_b).BR(Lambda_b->Lambda X) = (3.50 +- 0.32(stat) +- 0.35(sys))%.
Comment: 16 pages, LaTeX, 3 eps figs included, submitted to the European
Physical Journal
Measurement of the Michel Parameters in Leptonic Tau Decays
The Michel parameters of the leptonic tau decays are measured using the OPAL
detector at LEP. The Michel parameters are extracted from the energy spectra of
the charged decay leptons and from their energy-energy correlations. A new
method involving a global likelihood fit of Monte Carlo generated events with
complete detector simulation and background treatment has been applied to the
data recorded at center-of-mass energies close to sqrt(s) = M(Z) corresponding
to an integrated luminosity of 155 pb-1 during the years 1990 to 1995. If e-mu
universality is assumed and inferring the tau polarization from neutral current
data, the measured Michel parameters are extracted. Limits on non-standard
coupling constants and on the masses of new gauge bosons are obtained. The
results are in agreement with the V-A prediction of the Standard Model.
Comment: 32 pages, LaTeX, 9 eps figures included, submitted to the European
Physical Journal
Determination of the b Quark Mass at the Z Mass Scale
In hadronic decays of Z bosons recorded with the OPAL detector at LEP, events containing b quarks were selected using the long lifetime of b flavoured hadrons. Comparing the 3-jet rate in b events with that in d, u, s and c quark events, a significant difference was observed. Using Order(alpha_s squared) calculations for massive quarks, this difference was used to determine the b quark mass in the MSbar renormalisation scheme at the scale of the Z boson mass. By combining the results from seven different jet finders, the running b quark mass was determined to be mb(MZ) = (2.67 +/- 0.03(stat) +0.29/-0.37(syst) +/- 0.19(theo.)) GeV. Evolving this value to the b quark mass scale itself yields mb(mb) = (3.95 +0.52/-0.62) GeV, consistent with results obtained at the b quark production threshold. This determination confirms the QCD expectation of a scale dependent quark mass. A constant mass is ruled out by 3.9 standard deviations.
A Study of One-Prong Tau Decays with a Charged Kaon
From an analysis of the ionisation energy loss of charged particles selected from 110326 e+e- -> tau+tau- candidates recorded by the OPAL detector at e+e- centre-of-mass energies near the Z0 resonance, we determine the one-prong tau decay branching ratios Br(tau- -> nu_tau K- >=0h0) = (1.528 +- 0.039 +- 0.040)% and Br(tau- -> nu_tau K-) = (0.658 +- 0.024 +- 0.029)%, where the h0 notation refers to a pi0, an eta, a K^0_S, or a K^0_L, and where the first uncertainty is statistical and the second is systematic.
Multiplicities of pi0, eta, K0 and of charged particles in quark and gluon jets
We compared the multiplicities of pizero, eta, Kzero and of charged particles in quark and gluon jets in 3-jet events, as measured by the OPAL experiment at LEP. The comparisons were performed for distributions unfolded to 100% pure quark and gluon jets, at an effective scale Qjet which took into account topological dependences of the 3-jet environment. The ratio of particle multiplicity in gluon jets to that in quark jets as a function of Qjet for pizero, eta and Kzero was found to be independent of the particle species. This is consistent with the QCD prediction that the observed enhancement in the mean particle rate in gluon jets with respect to quark jets should be independent of particle species. In contrast to some theoretical predictions and previous observations, we observed no evidence for an enhancement of eta meson production in gluon jets with respect to quark jets, beyond that observed for charged particles. We measured the ratio of the slope of the average charged particle multiplicity in gluon jets to that in quark jets, C, and we compared it to a next-to-next-to-next-to leading order calculation. Our result, C = 2.27 +- 0.20 (stat+syst), is about one standard deviation higher than the perturbative prediction.
A Search for a Narrow Radial Excitation of the D*+/- Meson
A sample of 3.73 million hadronic Z decays, recorded with the OPAL detector at LEP in the years 1991-95, has been used to search for a narrow resonance corresponding to the decay of the D*'+/-(2629) meson into D*+/- pi+ pi-. The D*+ mesons are reconstructed in the decay channel D*+ -> D0 pi+ with D0 -> K- pi+. No evidence for a narrow D*'+/-(2629) resonance is found. A limit on the production of D*'+/-(2629) in hadronic Z decays is derived: f(Z -> D*'+/-(2629)) x Br(D*'+ -> D*+ pi+ pi-) < 3.1 x 10^{-3} (95% C.L.).
- …